Improved Spoken Query Transcription Using Co-Occurrence Information
Authors
Abstract
Spoken queries are a natural medium for searching the Mobile Web. Language modeling for voice search recognition poses different challenges than more conventional speech applications, because spoken queries are usually a set of keywords with no syntactic or grammatical structure. This paper describes a co-occurrence-based approach to improving the accuracy of automatic transcription of voice queries. With the right choice of scoring function and co-occurrence level, we show that co-occurrence information gives a 2% relative accuracy improvement over a state-of-the-art system.
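One way to picture how co-occurrence information could be applied is as a rescoring pass over a recognizer's N-best list. The sketch below is illustrative only and is not the paper's method: it assumes pointwise mutual information (PMI) as the scoring function, whole-query co-occurrence as the co-occurrence level, and a simple linear combination with the recognizer score. The function names, the smoothing constant `eps`, the `weight` parameter, and the toy query log are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(query_log):
    """Count, per query, which words and unordered word pairs appear."""
    word_counts, pair_counts = Counter(), Counter()
    for query in query_log:
        words = sorted(set(query.split()))
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts, len(query_log)


def pmi(w1, w2, word_counts, pair_counts, n_queries, eps=0.1):
    """Smoothed PMI (assumed scoring function); 0.0 if either word is unseen."""
    c1, c2 = word_counts[w1], word_counts[w2]
    if c1 == 0 or c2 == 0:
        return 0.0  # no evidence either way
    pair = tuple(sorted((w1, w2)))
    return math.log((pair_counts[pair] + eps) * n_queries / (c1 * c2))


def cooccurrence_score(hypothesis, counts):
    """Mean PMI over all distinct word pairs in one hypothesis."""
    words = sorted(set(hypothesis.split()))
    pairs = list(combinations(words, 2))
    if not pairs:
        return 0.0
    return sum(pmi(a, b, *counts) for a, b in pairs) / len(pairs)


def rescore(nbest, counts, weight=1.0):
    """Re-rank (text, recognizer_score) pairs, best first."""
    scored = [(text, score + weight * cooccurrence_score(text, counts))
              for text, score in nbest]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration.
query_log = [
    "pizza hut menu", "pizza hut near me", "pizza delivery near me",
    "piece of cake recipe", "weather new york", "new york pizza",
]
counts = build_counts(query_log)
nbest = [("piece a hut menu", -10.0), ("pizza hut menu", -10.5)]
print(rescore(nbest, counts)[0][0])
```

In this toy example the recognizer initially prefers the first hypothesis, but the word pairs in the second co-occur in the log, so the combined score moves it to the top. A real system would estimate the counts from a large query log and tune `weight` on held-out data.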
Similar resources
New Developments in Spoken Query Transcription
The rapid growth of mobile devices with the ability to browse the Internet has opened up interesting application areas for speech and natural language processing technologies. Voice search is one such application where speech technology is making a big impact by enabling people to access the Internet conveniently from mobile devices. Spoken queries are a natural medium for searching the Mobile ...
Segmented Spoken Document Retrieval Using Word Co-occurrence Information
This paper presents several approaches for NTCIR-11 SpokenQuery&Doc [1] and proposes several schemes that use word co-occurrence information for spoken document retrieval. Automatic transcriptions of spoken documents usually contain misrecognized words, which significantly degrades the performance of spoken document retrieval. The cosine similarity to measure a document similarity must be i...
Speech-Based Retrieval Using Semantic Co-Occurrence Filtering
In this paper we demonstrate that speech recognition can be effectively applied to information retrieval (IR) applications. Our system exploits the fact that the intended words of a spoken query tend to co-occur in text documents in close proximity, whereas word combinations that are the result of recognition errors are usually not semantically correlated and thus do not appear together. Termed...
Effects of Query Expansion for Spoken Document Passage Retrieval
One of the major challenges for spoken document retrieval is how to handle speech recognition errors within the target documents. Query expansion is promising for this challenge. In this paper, we apply relevance models, a type of query expansion method, for the spoken document passage retrieval task. We adapted the original relevance model for passage retrieval. We also extended it to benefit ...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...